In [6]:
import pandas as pd
In [7]:
pd.read_csv(r"C:\Users\MD Talha Mobashshir\Downloads\india-usa_innings_data.csv")
Out[7]:
batter bowler non_striker runs_batter runs_extras runs_total wickets_0_player_out wickets_0_kind team over ... wickets_0_fielders_0_name review_by review_umpire review_batter review_decision review_type extras_legbyes wickets_0_fielders_1_name extras_noballs extras_penalty
0 Shayan Jahangir Arshdeep Singh SR Taylor 0 0 0 Shayan Jahangir lbw United States of America 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 AGS Gous Arshdeep Singh SR Taylor 0 0 0 NaN NaN United States of America 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 AGS Gous Arshdeep Singh SR Taylor 0 0 0 NaN NaN United States of America 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 AGS Gous Arshdeep Singh SR Taylor 0 1 1 NaN NaN United States of America 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 AGS Gous Arshdeep Singh SR Taylor 2 0 2 NaN NaN United States of America 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
231 SA Yadav SN Netravalkar S Dube 0 0 0 NaN NaN India 17 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
232 SA Yadav SN Netravalkar S Dube 1 0 1 NaN NaN India 17 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
233 SA Yadav Ali Khan S Dube 1 0 1 NaN NaN India 18 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
234 S Dube Ali Khan SA Yadav 0 1 1 NaN NaN India 18 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
235 S Dube Ali Khan SA Yadav 2 0 2 NaN NaN India 18 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

236 rows × 21 columns

In [8]:
data=pd.read_csv(r"C:\Users\MD Talha Mobashshir\Downloads\india-usa_innings_data.csv")
In [9]:
data.head()
Out[9]:
batter bowler non_striker runs_batter runs_extras runs_total wickets_0_player_out wickets_0_kind team over ... wickets_0_fielders_0_name review_by review_umpire review_batter review_decision review_type extras_legbyes wickets_0_fielders_1_name extras_noballs extras_penalty
0 Shayan Jahangir Arshdeep Singh SR Taylor 0 0 0 Shayan Jahangir lbw United States of America 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 AGS Gous Arshdeep Singh SR Taylor 0 0 0 NaN NaN United States of America 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 AGS Gous Arshdeep Singh SR Taylor 0 0 0 NaN NaN United States of America 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 AGS Gous Arshdeep Singh SR Taylor 0 1 1 NaN NaN United States of America 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 AGS Gous Arshdeep Singh SR Taylor 2 0 2 NaN NaN United States of America 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 21 columns

In [5]:
data.shape
Out[5]:
(236, 21)
In [6]:
data.columns
Out[6]:
Index(['batter', 'bowler', 'non_striker', 'runs_batter', 'runs_extras',
       'runs_total', 'wickets_0_player_out', 'wickets_0_kind', 'team', 'over',
       'extras_wides', 'wickets_0_fielders_0_name', 'review_by',
       'review_umpire', 'review_batter', 'review_decision', 'review_type',
       'extras_legbyes', 'wickets_0_fielders_1_name', 'extras_noballs',
       'extras_penalty'],
      dtype='object')
In [12]:
missing_values=data.isnull().sum()   #checking for missing values in the dataset
In [13]:
data_types=data.dtypes   #checking the data types of the columns
In [14]:
missing_values
Out[14]:
batter                         0
bowler                         0
non_striker                    0
runs_batter                    0
runs_extras                    0
runs_total                     0
wickets_0_player_out         225
wickets_0_kind               225
team                           0
over                           0
extras_wides                 231
wickets_0_fielders_0_name    228
review_by                    235
review_umpire                235
review_batter                235
review_decision              235
review_type                  235
extras_legbyes               234
wickets_0_fielders_1_name    235
extras_noballs               235
extras_penalty               235
dtype: int64
In [15]:
data_types
Out[15]:
batter                        object
bowler                        object
non_striker                   object
runs_batter                    int64
runs_extras                    int64
runs_total                     int64
wickets_0_player_out          object
wickets_0_kind                object
team                          object
over                           int64
extras_wides                 float64
wickets_0_fielders_0_name     object
review_by                     object
review_umpire                 object
review_batter                 object
review_decision               object
review_type                   object
extras_legbyes               float64
wickets_0_fielders_1_name     object
extras_noballs               float64
extras_penalty               float64
dtype: object
In [11]:
#The data has null values in several columns, but in this dataset a null is meaningful (for example, no wicket or no extra on that delivery), so they are left as they are.
  

Grouping the data

In [16]:
#total runs scored by each team
data.groupby("team")["runs_total"].sum()
Out[16]:
team
India                       111
United States of America    110
Name: runs_total, dtype: int64
In [17]:
total_runs=data.groupby("team")["runs_total"].sum()
In [18]:
total_runs
Out[18]:
team
India                       111
United States of America    110
Name: runs_total, dtype: int64
In [15]:
#total wickets lost by each team ("team" is the batting side, so these are dismissals suffered, not wickets taken)
data["wickets_0_player_out"].notna().groupby(data["team"]).sum()
Out[15]:
team
India                       3
United States of America    8
Name: wickets_0_player_out, dtype: int64
In [19]:
total_wickets=data["wickets_0_player_out"].notna().groupby(data["team"]).sum()
In [20]:
total_wickets
Out[20]:
team
India                       3
United States of America    8
Name: wickets_0_player_out, dtype: int64
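Because `team` identifies the batting side on each delivery, the counts above are wickets lost. A minimal sketch of crediting wickets to the bowling side instead, assuming a two-team match; the toy frame and the `bowling_team` column are illustrative and not part of the original notebook.

```python
import pandas as pd

# Toy ball-by-ball frame; "team" is the batting side, as in the match data.
toy = pd.DataFrame({
    "team": ["United States of America"] * 3 + ["India"] * 3,
    "wickets_0_player_out": ["Shayan Jahangir", None, "AGS Gous", "V Kohli", None, None],
})

# Hypothetical helper column: with two teams, the bowling side is the other team.
teams = toy["team"].unique()
opponent = {teams[0]: teams[1], teams[1]: teams[0]}
toy["bowling_team"] = toy["team"].map(opponent)

is_wicket = toy["wickets_0_player_out"].notna()
wickets_lost = is_wicket.groupby(toy["team"]).sum()
wickets_taken_by_team = is_wicket.groupby(toy["bowling_team"]).sum()
print(wickets_lost)
print(wickets_taken_by_team)
```

Keeping both series side by side makes it harder to confuse dismissals suffered with wickets taken when reading later charts.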
In [22]:
# total extras by each team
data[["team","runs_extras","extras_wides","extras_legbyes","extras_noballs","extras_penalty"]].groupby("team").sum()
Out[22]:
runs_extras extras_wides extras_legbyes extras_noballs extras_penalty
team
India 9 2.0 1.0 1.0 5.0
United States of America 8 7.0 1.0 0.0 0.0
In [23]:
total_extras=data[["team","runs_extras","extras_wides","extras_legbyes","extras_noballs","extras_penalty"]].groupby("team").sum()
In [20]:
# runs scored by each batter
data.groupby("batter")["runs_batter"].sum()
Out[20]:
batter
AGS Gous             2
Aaron Jones         11
CJ Anderson         15
Harmeet Singh       10
Jasdeep Singh        2
NR Kumar            27
RG Sharma            3
RR Pant             18
S Dube              31
SA Yadav            50
SC van Schalkwyk    11
SR Taylor           24
Shayan Jahangir      0
V Kohli              0
Name: runs_batter, dtype: int64
In [24]:
batter_runs=data.groupby("batter")["runs_batter"].sum()
In [25]:
# balls faced by each batter (counts every delivery row, including wides)
data.groupby("batter").size()
Out[25]:
batter
AGS Gous             6
Aaron Jones         22
CJ Anderson         12
Harmeet Singh       10
Jasdeep Singh        7
NR Kumar            24
RG Sharma            6
RR Pant             20
S Dube              37
SA Yadav            49
SC van Schalkwyk    10
SR Taylor           31
Shayan Jahangir      1
V Kohli              1
dtype: int64
In [26]:
balls_faced=data.groupby("batter").size()
In [27]:
#strike rate of each batter
strike_rate=(batter_runs/balls_faced)*100
In [28]:
strike_rate
Out[28]:
batter
AGS Gous             33.333333
Aaron Jones          50.000000
CJ Anderson         125.000000
Harmeet Singh       100.000000
Jasdeep Singh        28.571429
NR Kumar            112.500000
RG Sharma            50.000000
RR Pant              90.000000
S Dube               83.783784
SA Yadav            102.040816
SC van Schalkwyk    110.000000
SR Taylor            77.419355
Shayan Jahangir       0.000000
V Kohli               0.000000
dtype: float64
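`groupby("batter").size()` counts every delivery row, and wides are conventionally not counted as balls faced. A rough sketch of a strike rate that excludes wides, assuming `extras_wides` is NaN on every delivery that was not a wide (as the missing-value summary suggests); the batter names and figures here are made up.

```python
import pandas as pd

nan = float("nan")
# Toy data: batter "A" receives one wide, which should not count as a ball faced.
toy = pd.DataFrame({
    "batter": ["A", "A", "A", "B", "B"],
    "runs_batter": [4, 0, 1, 0, 2],
    "extras_wides": [nan, 1.0, nan, nan, nan],
})

# Keep only deliveries that were not wides before counting balls faced.
not_wide = toy[toy["extras_wides"].isna()]
balls_faced_legal = not_wide.groupby("batter").size()

runs = toy.groupby("batter")["runs_batter"].sum()
strike_rate_legal = runs / balls_faced_legal * 100
print(strike_rate_legal)
```

With only a handful of wides in the match, the difference from the figures above is likely small, but it can matter for batters who faced very few balls.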
In [29]:
#boundaries hit by each batter
data[(data["runs_batter"]==4) | (data["runs_batter"]==6)].groupby(["batter","runs_batter"]).size().unstack(fill_value=0)
Out[29]:
runs_batter 4 6
batter
Aaron Jones 0 1
CJ Anderson 1 1
Harmeet Singh 0 1
NR Kumar 2 1
RR Pant 1 1
S Dube 1 1
SA Yadav 2 2
SC van Schalkwyk 1 0
SR Taylor 0 2
In [30]:
boundaries=data[(data["runs_batter"]==4) | (data["runs_batter"]==6)].groupby(["batter","runs_batter"]).size().unstack(fill_value=0)
In [31]:
#wickets taken by each bowler
data["wickets_0_player_out"].notna().groupby(data["bowler"]).sum()
Out[31]:
bowler
AR Patel            1
Ali Khan            1
Arshdeep Singh      4
CJ Anderson         0
HH Pandya           2
JJ Bumrah           0
Jasdeep Singh       0
Mohammed Siraj      1
S Dube              0
SC van Schalkwyk    0
SN Netravalkar      2
Name: wickets_0_player_out, dtype: int64
In [32]:
wickets_taken=data["wickets_0_player_out"].notna().groupby(data["bowler"]).sum()
In [33]:
#runs conceded by each bowler
data.groupby("bowler")["runs_total"].sum()
Out[33]:
bowler
AR Patel            25
Ali Khan            22
Arshdeep Singh       9
CJ Anderson         22
HH Pandya           15
JJ Bumrah           25
Jasdeep Singh       24
Mohammed Siraj      25
S Dube              11
SC van Schalkwyk    25
SN Netravalkar      18
Name: runs_total, dtype: int64
In [37]:
runs_conceded=data.groupby("bowler")["runs_total"].sum()
In [38]:
#balls bowled by each bowler (counts every delivery row, including wides and no-balls)
data.groupby("bowler").size()
Out[38]:
bowler
AR Patel            19
Ali Khan            21
Arshdeep Singh      25
CJ Anderson         19
HH Pandya           24
JJ Bumrah           25
Jasdeep Singh       25
Mohammed Siraj      24
S Dube               6
SC van Schalkwyk    24
SN Netravalkar      24
dtype: int64
In [39]:
balls_bowled=data.groupby("bowler").size()
In [40]:
#economy rate of each bowler
economy_rate=runs_conceded/(balls_bowled/6)
In [35]:
economy_rate
Out[35]:
bowler
AR Patel             7.894737
Ali Khan             6.285714
Arshdeep Singh       2.160000
CJ Anderson          6.947368
HH Pandya            3.750000
JJ Bumrah            6.000000
Jasdeep Singh        5.760000
Mohammed Siraj       6.250000
S Dube              11.000000
SC van Schalkwyk     6.250000
SN Netravalkar       4.500000
dtype: float64
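The economy rate above divides all runs by all delivery rows. Under the usual scoring convention (an assumption here, since the notebook does not state it), wides and no-balls are not counted as legal balls, and leg byes are not charged to the bowler. A minimal sketch on made-up data for a hypothetical bowler "X":

```python
import pandas as pd

nan = float("nan")
# Toy over: seven deliveries, one of them a wide and one a leg bye.
toy = pd.DataFrame({
    "bowler": ["X"] * 7,
    "runs_total": [0, 1, 1, 4, 0, 2, 1],
    "extras_wides": [nan, 1.0, nan, nan, nan, nan, nan],
    "extras_noballs": [nan] * 7,
    "extras_legbyes": [nan, nan, 1.0, nan, nan, nan, nan],
})

# Legal deliveries: neither a wide nor a no-ball.
legal = toy["extras_wides"].isna() & toy["extras_noballs"].isna()
legal_balls = legal.groupby(toy["bowler"]).sum()

# Runs charged to the bowler: total runs minus leg byes.
charged = toy["runs_total"] - toy["extras_legbyes"].fillna(0)
runs_charged = charged.groupby(toy["bowler"]).sum()

economy_legal = runs_charged / (legal_balls / 6)
print(economy_legal)
```

Penalty runs, if present, could be subtracted in the same way as leg byes.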
In [41]:
#dot balls bowled by each bowler
data[data["runs_total"]==0].groupby("bowler").size()
Out[41]:
bowler
AR Patel             5
Ali Khan             7
Arshdeep Singh      17
CJ Anderson          8
HH Pandya           18
JJ Bumrah           14
Jasdeep Singh       11
Mohammed Siraj      11
S Dube               3
SC van Schalkwyk     8
SN Netravalkar      13
dtype: int64
In [42]:
dot_balls=data[data["runs_total"]==0].groupby("bowler").size()

Combining all these stats into DataFrames for batters and bowlers

In [43]:
#batter stats
batter_stats=pd.DataFrame({
    "Runs": batter_runs,
    "Balls Faced": balls_faced,
    "Strike Rate": strike_rate
    }).join(boundaries)
In [39]:
batter_stats
Out[39]:
Runs Balls Faced Strike Rate 4 6
batter
AGS Gous 2 6 33.333333 NaN NaN
Aaron Jones 11 22 50.000000 0.0 1.0
CJ Anderson 15 12 125.000000 1.0 1.0
Harmeet Singh 10 10 100.000000 0.0 1.0
Jasdeep Singh 2 7 28.571429 NaN NaN
NR Kumar 27 24 112.500000 2.0 1.0
RG Sharma 3 6 50.000000 NaN NaN
RR Pant 18 20 90.000000 1.0 1.0
S Dube 31 37 83.783784 1.0 1.0
SA Yadav 50 49 102.040816 2.0 2.0
SC van Schalkwyk 11 10 110.000000 1.0 0.0
SR Taylor 24 31 77.419355 0.0 2.0
Shayan Jahangir 0 1 0.000000 NaN NaN
V Kohli 0 1 0.000000 NaN NaN
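The NaN entries in the `4` and `6` columns above belong to batters who hit no boundaries, so they were absent from the `boundaries` table when it was joined. A small sketch of turning those gaps into integer zeros, using two rows copied from the output above; `stats` is an illustrative name.

```python
import pandas as pd

# Two batters from the table above: one without boundaries, one with.
runs = pd.Series({"AGS Gous": 2, "SA Yadav": 50}, name="Runs")
boundaries = pd.DataFrame({4: [2], 6: [2]}, index=["SA Yadav"])

# A left join leaves NaN where a batter has no row in `boundaries`.
stats = runs.to_frame().join(boundaries)

# Replace the gaps with 0 and restore integer counts.
stats = stats.fillna({4: 0, 6: 0}).astype({4: int, 6: int})
print(stats)
```

Filling only the boundary columns avoids masking genuine missing values elsewhere in the frame.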
In [45]:
#bowlers_stats
bowlers_stats=pd.DataFrame({
    "Wickets":wickets_taken,
    "Runs conceded":runs_conceded,
    "Balls Bowled":balls_bowled,
    "Economy rate":economy_rate,
    "Dot balls":dot_balls})
In [46]:
bowlers_stats
Out[46]:
Wickets Runs conceded Balls Bowled Economy rate Dot balls
bowler
AR Patel 1 25 19 7.894737 5
Ali Khan 1 22 21 6.285714 7
Arshdeep Singh 4 9 25 2.160000 17
CJ Anderson 0 22 19 6.947368 8
HH Pandya 2 15 24 3.750000 18
JJ Bumrah 0 25 25 6.000000 14
Jasdeep Singh 0 24 25 5.760000 11
Mohammed Siraj 1 25 24 6.250000 11
S Dube 0 11 6 11.000000 3
SC van Schalkwyk 0 25 24 6.250000 8
SN Netravalkar 2 18 24 4.500000 13
In [47]:
total_runs
Out[47]:
team
India                       111
United States of America    110
Name: runs_total, dtype: int64
In [48]:
total_extras
Out[48]:
runs_extras extras_wides extras_legbyes extras_noballs extras_penalty
team
India 9 2.0 1.0 1.0 5.0
United States of America 8 7.0 1.0 0.0 0.0
In [49]:
total_wickets
Out[49]:
team
India                       3
United States of America    8
Name: wickets_0_player_out, dtype: int64

Run progression over Overs

In [50]:
import plotly.graph_objects as go
In [51]:
data[data["team"]=="India"].groupby("over")["runs_total"].sum().cumsum()
Out[51]:
over
0       2
1      10
2      12
3      16
4      25
5      33
6      36
7      39
8      41
9      47
10     53
11     55
12     60
13     67
14     81
15     87
16    102
17    107
18    111
Name: runs_total, dtype: int64
In [52]:
india_runs_progression=data[data["team"]=="India"].groupby("over")["runs_total"].sum().cumsum()
In [53]:
usa_runs_progression=data[data["team"]=="United States of America"].groupby("over")["runs_total"].sum().cumsum()
In [54]:
fig=go.Figure()
fig.add_trace(go.Scatter(x=india_runs_progression.index,y=india_runs_progression.values,mode="lines+markers", name="India"))
fig.add_trace(go.Scatter(x=usa_runs_progression.index,y=usa_runs_progression.values,mode="lines+markers", name="USA"))
fig.update_layout(title="Runs Progression Over Overs",xaxis_title="Overs",yaxis_title="Cumulative Runs",legend_title="Teams",template="plotly_white")
fig.show()
The graph shows cumulative runs by over for India and the USA in their T20 World Cup match. Both teams scored at a steady rate early on, with India slightly ahead in the first few overs. The USA then gained momentum and briefly took the lead around the middle overs. India accelerated in the later overs, overtook the USA, and stayed ahead until the end. The key takeaway is India’s strong finish: a rising run rate in the final overs secured the win.
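Statements about which side was ahead at a given stage can be checked numerically by aligning the two cumulative series on the over index. A sketch with hypothetical totals (not the match data); the names `team_a`, `team_b` and `lead_A` are illustrative.

```python
import pandas as pd

# Hypothetical cumulative totals per over; team B bats one over longer.
team_a = pd.Series([5, 12, 20, 31], index=[0, 1, 2, 3], name="A")
team_b = pd.Series([7, 11, 18, 35, 40], index=[0, 1, 2, 3, 4], name="B")

# concat aligns on the over index; overs missing for one side become NaN.
progression = pd.concat([team_a, team_b], axis=1)
progression["lead_A"] = progression["A"] - progression["B"]

# Overs at the end of which team A was ahead (NaN comparisons are False).
overs_a_ahead = progression.index[progression["lead_A"] > 0].tolist()
print(progression)
print(overs_a_ahead)
```

The same pattern should apply to `india_runs_progression` and `usa_runs_progression`, since both are indexed by over.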

Wickets Timeline

In [55]:
data[(data["team"]=="India") & (data["wickets_0_player_out"].notna())].groupby("over").size()
Out[55]:
over
0    1
2    1
7    1
dtype: int64
In [56]:
india_wickets=data[(data["team"]=="India") & (data["wickets_0_player_out"].notna())].groupby("over").size()
In [57]:
usa_wickets=data[(data["team"]=="United States of America") & (data["wickets_0_player_out"].notna())].groupby("over").size()
In [58]:
fig=go.Figure()
fig.add_trace(go.Bar(x=india_wickets.index,y=india_wickets.values,name="India",marker_color="blue",opacity=0.7))
fig.add_trace(go.Bar(x=usa_wickets.index,y=usa_wickets.values,name="USA",marker_color="red",opacity=0.7))
fig.update_layout(title="Wickets Timeline",xaxis_title="Overs",yaxis_title="No. of Wickets",barmode="group",template="plotly_white",legend_title="Teams")
fig.show()
The wickets timeline shows how many wickets each team lost in each over. The USA lost wickets more often: two fell in the first over, followed by regular dismissals through the rest of their innings. India lost only three wickets, all within the first eight overs (overs 0, 2 and 7 in the zero-indexed data), and then batted through the remaining overs without further loss. The frequent loss of wickets disrupted the USA’s momentum, while India’s long unbroken partnership after the early wickets let them keep a steady scoring rate and ultimately secure the win.

Run Distribution by Batters

In [59]:
import plotly.express as px
In [60]:
fig=px.bar(batter_stats,x=batter_stats.index,y="Runs",title="Run Distribution by Batters",labels={"x":"Batter","Runs":"Runs Scored"},template="plotly_white")
fig.update_layout(xaxis_title="Batter",yaxis_title="Runs Scored",xaxis=dict(tickangle=90))
fig.show()
SA Yadav was the highest scorer with 50 runs, followed by S Dube (31) and NR Kumar (27). Yadav and Dube provided the bulk of India’s runs, while Kumar top-scored for the USA.

Bowling Performance

In [61]:
fig= go.Figure()
fig.add_trace(go.Scatter(x=bowlers_stats["Economy rate"],y=bowlers_stats["Wickets"],mode="markers+text",
                         text=bowlers_stats.index,textposition="top center",textfont=dict(family="sans serif",size=12,color="black"),
                         marker=dict(color="red",size=10),
                         name="Bowlers"))
fig.update_layout(title="Bowling Performance",xaxis_title="Economy Rate",
                  yaxis_title="Wickets Taken",template="plotly_white",autosize=False,width=800,height=600)
fig.show()
The bowling performance graph plots each bowler’s economy rate against wickets taken. Arshdeep Singh stands out as the most effective bowler, with the most wickets (4) and the lowest economy rate (2.16). HH Pandya (economy 3.75) and SN Netravalkar (4.50) took 2 wickets each. S Dube had the highest economy rate (11.00) and took no wickets, though he bowled only 6 deliveries.
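Dot-ball percentage is another way to compare how much pressure each bowler applied. A short sketch using three rows copied from the `bowlers_stats` table; note that "Balls Bowled" there counts every delivery row, and the `Dot ball %` column is an added, illustrative name.

```python
import pandas as pd

# Figures copied from the bowlers_stats table for three bowlers.
sample = pd.DataFrame(
    {"Balls Bowled": [25, 24, 6], "Dot balls": [17, 18, 3]},
    index=["Arshdeep Singh", "HH Pandya", "S Dube"],
)

# Share of deliveries on which no run of any kind was scored.
sample["Dot ball %"] = (sample["Dot balls"] / sample["Balls Bowled"] * 100).round(1)
print(sample)
```

On these figures HH Pandya bowled the highest share of dot balls, which the wickets-versus-economy scatter alone does not show.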

Partnership Contribution - India

In [62]:
#separate data for India and USA
data[data["team"]=="India"].groupby(["over","batter","non_striker"])["runs_total"].sum().reset_index()
Out[62]:
over batter non_striker runs_total
0 0 RG Sharma V Kohli 1
1 0 RR Pant RG Sharma 1
2 0 V Kohli RG Sharma 0
3 1 RG Sharma RR Pant 2
4 1 RR Pant RG Sharma 6
5 2 RG Sharma RR Pant 0
6 2 RR Pant SA Yadav 1
7 2 SA Yadav RR Pant 1
8 3 RR Pant SA Yadav 2
9 3 SA Yadav RR Pant 2
10 4 RR Pant SA Yadav 0
11 4 SA Yadav RR Pant 9
12 5 RR Pant SA Yadav 6
13 5 SA Yadav RR Pant 2
14 6 RR Pant SA Yadav 1
15 6 SA Yadav RR Pant 2
16 7 RR Pant SA Yadav 2
17 7 S Dube SA Yadav 0
18 7 SA Yadav RR Pant 1
19 8 S Dube SA Yadav 1
20 8 SA Yadav S Dube 1
21 9 S Dube SA Yadav 3
22 9 SA Yadav S Dube 3
23 10 S Dube SA Yadav 6
24 11 S Dube SA Yadav 1
25 11 SA Yadav S Dube 1
26 12 S Dube SA Yadav 3
27 12 SA Yadav S Dube 2
28 13 S Dube SA Yadav 2
29 13 SA Yadav S Dube 5
30 14 S Dube SA Yadav 13
31 14 SA Yadav S Dube 1
32 15 S Dube SA Yadav 4
33 15 SA Yadav S Dube 2
34 16 S Dube SA Yadav 1
35 16 SA Yadav S Dube 14
36 17 S Dube SA Yadav 1
37 17 SA Yadav S Dube 4
38 18 S Dube SA Yadav 3
39 18 SA Yadav S Dube 1
In [63]:
india_partnership_data=data[data["team"]=="India"].groupby(["over","batter","non_striker"])["runs_total"].sum().reset_index()
In [64]:
usa_partnership_data=data[data["team"]=="United States of America"].groupby(["over","batter","non_striker"])["runs_total"].sum().reset_index()
In [65]:
#creating pivot tables for better visualization
india_partnership_data.pivot(index="over",columns=["batter","non_striker"],values="runs_total").fillna(0)
Out[65]:
batter RG Sharma RR Pant V Kohli RG Sharma RR Pant SA Yadav S Dube SA Yadav
non_striker V Kohli RG Sharma RG Sharma RR Pant SA Yadav RR Pant SA Yadav S Dube
over
0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0
1 0.0 6.0 0.0 2.0 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0
3 0.0 0.0 0.0 0.0 2.0 2.0 0.0 0.0
4 0.0 0.0 0.0 0.0 0.0 9.0 0.0 0.0
5 0.0 0.0 0.0 0.0 6.0 2.0 0.0 0.0
6 0.0 0.0 0.0 0.0 1.0 2.0 0.0 0.0
7 0.0 0.0 0.0 0.0 2.0 1.0 0.0 0.0
8 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0
9 0.0 0.0 0.0 0.0 0.0 0.0 3.0 3.0
10 0.0 0.0 0.0 0.0 0.0 0.0 6.0 0.0
11 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0
12 0.0 0.0 0.0 0.0 0.0 0.0 3.0 2.0
13 0.0 0.0 0.0 0.0 0.0 0.0 2.0 5.0
14 0.0 0.0 0.0 0.0 0.0 0.0 13.0 1.0
15 0.0 0.0 0.0 0.0 0.0 0.0 4.0 2.0
16 0.0 0.0 0.0 0.0 0.0 0.0 1.0 14.0
17 0.0 0.0 0.0 0.0 0.0 0.0 1.0 4.0
18 0.0 0.0 0.0 0.0 0.0 0.0 3.0 1.0
In [66]:
india_partnership_pivot=india_partnership_data.pivot(index="over",columns=["batter","non_striker"],values="runs_total").fillna(0)
usa_partnership_pivot=usa_partnership_data.pivot(index="over",columns=["batter","non_striker"],values="runs_total").fillna(0)
In [67]:
#converting the pivot table to a long format
#resetting the index first
india_partnership_pivot_reset = india_partnership_pivot.reset_index()

# Flatten the MultiIndex columns to ensure 'over' is a regular column
india_partnership_pivot_reset.columns = india_partnership_pivot_reset.columns.map('_'.join).str.strip('_')
In [68]:
# Converting the pivot table to a long format
india_partnership_long = india_partnership_pivot_reset.melt(id_vars=["over"], var_name="batter_non_striker", value_name="runs_total")

# Splitting the combined 'batter_non_striker' column into 'batter' and 'non_striker'
india_partnership_long[['batter', 'non_striker']] = india_partnership_long['batter_non_striker'].str.split('_', expand=True)
india_partnership_long = india_partnership_long.drop(columns=['batter_non_striker'])
In [69]:
india_partnership_long
Out[69]:
over runs_total batter non_striker
0 0 1.0 RG Sharma V Kohli
1 1 0.0 RG Sharma V Kohli
2 2 0.0 RG Sharma V Kohli
3 3 0.0 RG Sharma V Kohli
4 4 0.0 RG Sharma V Kohli
... ... ... ... ...
147 14 1.0 SA Yadav S Dube
148 15 2.0 SA Yadav S Dube
149 16 14.0 SA Yadav S Dube
150 17 4.0 SA Yadav S Dube
151 18 1.0 SA Yadav S Dube

152 rows × 4 columns

In [70]:
#Creating a stacked bar chart
fig=go.Figure()
#adding bars for each partnership
for (batter,non_striker) in india_partnership_pivot.columns:
    partnership_data=india_partnership_long[(india_partnership_long["batter"]==batter)  & (india_partnership_long["non_striker"]==non_striker)]
    fig.add_trace(go.Bar(x=partnership_data["over"],y=partnership_data["runs_total"],name=f'{batter} & {non_striker}'))
fig.update_layout(title="Partnership Contributions - India",xaxis_title="Over",yaxis_title="Runs",barmode="stack",template="plotly_white",
                  legend_title="Partnership",legend=dict(x=1.05,y=1,traceorder="normal",font=dict(size=10)),autosize=False,width=900,height=600)
fig.show()
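The partnership contributions graph for India shows the runs scored by each batting pair in each over. Each trace is an ordered (batter, non-striker) pair, so a single partnership appears as two series. The SA Yadav & S Dube pairing was the most productive: it ran from over 7 to the end of the innings and scored heavily in overs 14 and 16 (14 and 15 runs respectively). Before that, RR Pant & SA Yadav rebuilt the innings in overs 2 to 7 after the early wickets.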
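Grouping by `batter` and `non_striker` treats (A, B) and (B, A) as different pairs. A minimal sketch of totalling runs per partnership regardless of who was on strike, using made-up run values; the `pair` key is an illustrative construct, and it assumes the same two batters form only one partnership in an innings.

```python
import pandas as pd

# Toy deliveries: the same pair appears in both strike orders.
toy = pd.DataFrame({
    "batter": ["SA Yadav", "S Dube", "SA Yadav", "RR Pant"],
    "non_striker": ["S Dube", "SA Yadav", "S Dube", "SA Yadav"],
    "runs_total": [4, 6, 1, 2],
})

# Order-independent key: sort the two names so (A, B) and (B, A) match.
pair = toy[["batter", "non_striker"]].apply(lambda r: " & ".join(sorted(r)), axis=1)

partnership_totals = toy.groupby(pair)["runs_total"].sum().sort_values(ascending=False)
print(partnership_totals)
```

Totals computed this way make it easier to rank partnerships than reading stacked bars split by striker.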

Partnership Contribution - USA

In [71]:
#resetting the index first
usa_partnership_pivot_reset = usa_partnership_pivot.reset_index()

# Flatten the MultiIndex columns to ensure 'over' is a regular column
usa_partnership_pivot_reset.columns = usa_partnership_pivot_reset.columns.map('_'.join).str.strip('_')
In [72]:
# Converting the pivot table to a long format
usa_partnership_long = usa_partnership_pivot_reset.melt(id_vars=["over"], var_name="batter_non_striker", value_name="runs_total")

# Splitting the combined 'batter_non_striker' column into 'batter' and 'non_striker'
usa_partnership_long[['batter', 'non_striker']] = usa_partnership_long['batter_non_striker'].str.split('_', expand=True)
usa_partnership_long = usa_partnership_long.drop(columns=['batter_non_striker'])
In [73]:
#creating a stacked bar chart
fig=go.Figure()
#adding bars for each partnership
for (batter,non_striker) in usa_partnership_pivot.columns:
    partnership_data=usa_partnership_long[(usa_partnership_long["batter"]==batter) & (usa_partnership_long["non_striker"]==non_striker)]
    fig.add_trace(go.Bar(x=partnership_data["over"],y=partnership_data["runs_total"],name=f'{batter} & {non_striker}'))
fig.update_layout(title="Partnership Contributions - USA",xaxis_title="Over",yaxis_title="Runs",barmode="stack",template="plotly_white",
                  legend_title="Partnership",legend=dict(x=1.05,y=1,traceorder="normal",font=dict(size=10)),autosize=False,width=900,height=600)
fig.show()
The partnership contributions graph for the USA shows the runs scored by each batting pair in each over. Key partnerships such as SR Taylor & Aaron Jones and NR Kumar & SR Taylor boosted the scoring, particularly in the middle and late overs. However, the contributions are more sporadic than India’s, with several partnerships adding only a few runs.

USA key moments in Innings

In [74]:
#cumulative runs for both teams over the overs
data[data["team"]=="India"].groupby("over")["runs_total"].sum().cumsum()
Out[74]:
over
0       2
1      10
2      12
3      16
4      25
5      33
6      36
7      39
8      41
9      47
10     53
11     55
12     60
13     67
14     81
15     87
16    102
17    107
18    111
Name: runs_total, dtype: int64
In [75]:
india_cumulative_runs=data[data["team"]=="India"].groupby("over")["runs_total"].sum().cumsum()
In [76]:
usa_cumulative_runs=data[data["team"]=="United States of America"].groupby("over")["runs_total"].sum().cumsum()
In [77]:
#extracting key moments where wickets fell or significant runs were scored
india_key_moments=data[(data["team"]=="India") & data["wickets_0_player_out"].notna()]
usa_key_moments=data[(data["team"]=="United States of America") & data["wickets_0_player_out"].notna()]
In [78]:
#significant runs scored by each team
india_significant_runs=data[(data["team"]=="India") & (data["runs_total"]>=4)]
usa_significant_runs=data[(data["team"]=="United States of America") & (data["runs_total"]>=4)]
In [79]:
data[(data["team"]=="United States of America") & data["wickets_0_player_out"].notna()].groupby("over").size().cumsum()
Out[79]:
over
0     2
7     3
11    4
14    5
16    6
17    7
19    8
dtype: int64
In [80]:
usa_wickets_fall=data[(data["team"]=="United States of America") & data["wickets_0_player_out"].notna()].groupby("over").size().cumsum()
In [81]:
fig=go.Figure()
fig.add_trace(go.Scatter(x=usa_cumulative_runs.index,
                         y=usa_cumulative_runs.values,
                         mode="lines+markers",
                         name="USA Cumulative Runs",
                         line=dict(color="blue")))
fig.add_trace(go.Scatter(x=usa_wickets_fall.index,
                         y=usa_cumulative_runs.loc[usa_wickets_fall.index],
                         mode="markers",
                         name="USA wickets",
                         marker=dict(color="red",size=10)))
# Add annotations for key moments (label each wicket with the dismissed player)
for _, row in usa_key_moments.iterrows():
    fig.add_annotation(x=row["over"],y=usa_cumulative_runs.loc[row["over"]],
                       text=f"{row['wickets_0_player_out']} ({row['over']})",
                       showarrow=True,arrowhead=2,
                       axref="x",ayref="y",  # interpret ax/ay as data coordinates instead of pixel offsets
                       ax=row["over"],ay=usa_cumulative_runs.loc[row["over"]] + 5,
                       arrowcolor="black")
fig.update_layout(title="USA Key Moments in Innings",xaxis_title="Overs",yaxis_title="Cumulative Runs",template="plotly_white",
                  legend_title="USA Innings",autosize=False,width=900,height=600)
fig.show()
The graph highlights the key moments in the USA’s innings, plotting cumulative runs with wickets marked. Early wickets, those of Shayan Jahangir and AGS Gous in the first over, set back the USA’s momentum. Despite recoveries led by partnerships involving SR Taylor and NR Kumar, regular wickets in the middle and late overs, particularly from over 14 to over 19, slowed their progress. The dismissals of key players such as Aaron Jones and SR Taylor, and later of batters such as Harmeet Singh and CJ Anderson, prevented the USA from building a sustained run flow and limited their total to 110.
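The wicket markers are plotted at the end-of-over cumulative total, which can overstate the score at the moment a wicket actually fell. A sketch of a ball-by-ball fall-of-wickets table, assuming the rows of one innings are in delivery order; the data, the player labels "P1"/"P2" and the `score`/`wicket_no` columns are illustrative.

```python
import pandas as pd

# Toy innings: wickets fall on the third and fifth deliveries.
toy = pd.DataFrame({
    "over": [0, 0, 0, 1, 1, 1],
    "runs_total": [0, 4, 0, 1, 0, 6],
    "wickets_0_player_out": [None, None, "P1", None, "P2", None],
})

# Running score and running wicket count after each delivery.
toy["score"] = toy["runs_total"].cumsum()
is_wicket = toy["wickets_0_player_out"].notna()
toy["wicket_no"] = is_wicket.cumsum()

# Keep only the deliveries on which a wicket fell.
fall_of_wickets = toy.loc[is_wicket, ["wicket_no", "score", "over", "wickets_0_player_out"]]
print(fall_of_wickets)
```

In this toy case the second wicket fell at 5 runs, whereas the end-of-over total for that over is 11.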

India key moments in Innings

In [84]:
india_cumulative_runs=data[data["team"]=="India"].groupby("over")["runs_total"].sum().cumsum()
india_key_moments=data[(data["team"]=="India") & data["wickets_0_player_out"].notna()]
india_significant_runs=data[(data["team"]=="India") & (data["runs_total"]>=4)]
india_wickets_fall=data[(data["team"]=="India") & (data["wickets_0_player_out"].notna())].groupby("over").size().cumsum()
In [96]:
fig=go.Figure()
fig.add_trace(go.Scatter(x=india_cumulative_runs.index,y=india_cumulative_runs.values,mode="lines+markers",
                         name="India Cumulative Runs",line=dict(color="green")))
fig.add_trace(go.Scatter(x=india_wickets_fall.index,y=india_cumulative_runs.loc[india_wickets_fall.index],
                         mode="markers",name="India wickets",marker=dict(color="red",size=10)))
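# Label each wicket with the dismissed player
for _,row in india_key_moments.iterrows():
    fig.add_annotation(x=row["over"],y=india_cumulative_runs.loc[row["over"]],
                       text=f'{row["wickets_0_player_out"]} ({row["over"]})',
                       showarrow=True,
                       arrowhead=2,
                       axref="x",ayref="y",  # interpret ax/ay as data coordinates instead of pixel offsets
                       ax=row["over"],
                       ay=india_cumulative_runs.loc[row["over"]] + 5,
                       arrowcolor="black")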
fig.update_layout(title="India Key Moments In Innings",xaxis_title="Overs",yaxis_title="Cumulative Runs",
                  template="plotly_white",legend_title="India Innings",autosize=False,width=900,height=600)
fig.show()
Despite an early setback, with V Kohli and RG Sharma dismissed within the first three overs (overs 0 and 2 in the zero-indexed data), India maintained a steady run rate. The wicket of RR Pant in over 7 was another crucial moment, but the SA Yadav and S Dube partnership that followed stabilized the innings.

Avg run rate for both teams

In [101]:
data[data["team"]=="India"].groupby("over")["runs_total"].sum().mean()
Out[101]:
5.842105263157895
In [102]:
india_run_rate=data[data["team"]=="India"].groupby("over")["runs_total"].sum().mean()
In [103]:
usa_run_rate=data[data["team"]=="United States of America"].groupby("over")["runs_total"].sum().mean()
In [104]:
fig=go.Figure()
fig.add_trace(go.Bar(x=["India","USA"],y=[india_run_rate,usa_run_rate],marker_color=["green","blue"]))
fig.add_annotation(x="India",y=india_run_rate,text=f"{india_run_rate:.2f}",showarrow=False,yshift=10)
fig.add_annotation(x="USA",y=usa_run_rate,text=f"{usa_run_rate:.2f}",showarrow=False,yshift=10)
fig.update_layout(title="Comparison of Average Run Rate per Over",xaxis_title="Team",yaxis_title="Average Run Rate per Over",
                  template="plotly_white")
fig.show()
The comparison shows that India averaged 5.84 runs per over against the USA’s 5.50, so India scored slightly faster across their innings. Here the average is the mean of the per-over totals, so an incomplete final over counts as a full over. India’s higher rate reflects a steady flow of runs despite early setbacks, which was crucial in reaching the target. The USA’s slightly lower rate suggests they struggled to accelerate, especially in the middle overs, which limited their total.
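Taking `.mean()` of per-over totals treats a partly bowled over as a complete one. A sketch of a run rate based on legal deliveries instead, on a made-up innings and assuming `extras_wides` is NaN for non-wide deliveries; no-balls are omitted from the toy data for brevity.

```python
import pandas as pd

nan = float("nan")
# Toy innings: one full over (containing a wide) plus two balls of a second over.
toy = pd.DataFrame({
    "over": [0, 0, 0, 0, 0, 0, 0, 1, 1],
    "runs_total": [1, 1, 0, 4, 0, 2, 1, 6, 1],
    "extras_wides": [nan, 1.0, nan, nan, nan, nan, nan, nan, nan],
})

# Notebook-style figure: mean of runs per over, partial over counted in full.
per_over_mean = toy.groupby("over")["runs_total"].sum().mean()

# Runs per six legal deliveries.
legal_balls = int(toy["extras_wides"].isna().sum())
run_rate = toy["runs_total"].sum() / (legal_balls / 6)
print(per_over_mean, run_rate)
```

The two figures diverge most when an innings ends early in an over, as can happen in a successful chase.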

Comparison of Run Rate per Over

In [106]:
india_run_rate_per_over=data[data["team"]=="India"].groupby("over")["runs_total"].sum()
usa_run_rate_per_over=data[data["team"]=="United States of America"].groupby("over")["runs_total"].sum()
In [107]:
fig=go.Figure()
fig.add_trace(go.Scatter(x=india_run_rate_per_over.index,y=india_run_rate_per_over.values,mode="lines+markers",
                         name="India Run Rate",line=dict(color="green")))
fig.add_trace(go.Scatter(x=usa_run_rate_per_over.index,y=usa_run_rate_per_over.values,mode="lines+markers",
                         name="USA Run Rate",line=dict(color="blue")))
fig.update_layout(title="Comparison of Run Rate per Over",xaxis_title="Overs",yaxis_title="Runs",
                  template="plotly_white",legend_title="Run Rate",autosize=False,width=1000,height=600)
fig.show()
The USA’s runs per over fluctuated significantly, with peaks in the 10th and 15th overs but also several low-scoring overs, indicating inconsistency. India’s scoring was more stable, with a notable increase towards the end of the innings. This stability, together with the acceleration in the death overs, allowed India to chase the target successfully, while the USA’s variable rate reflects periods in which they struggled to maintain momentum.
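Consistency of scoring can be put into numbers by comparing the spread of runs per over rather than judging it from the chart alone. A minimal sketch with hypothetical per-over totals (not the match data); `steady` and `erratic` are illustrative names.

```python
import pandas as pd

# Hypothetical runs per over for two sides with the same average.
steady = pd.Series([5, 6, 5, 6, 5, 6])
erratic = pd.Series([1, 12, 2, 10, 0, 8])

# Same mean, very different spread: the standard deviation captures that.
summary = pd.DataFrame(
    {
        "mean": [steady.mean(), erratic.mean()],
        "std": [steady.std(), erratic.std()],
    },
    index=["steady", "erratic"],
)
print(summary)
```

Applying the same summary to `india_run_rate_per_over` and `usa_run_rate_per_over` would show whether the data supports the consistency claim.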

Conclusion

In conclusion, India’s consistent scoring, effective partnerships, and balanced bowling attack proved successful against the USA’s inconsistent batting and less impactful bowling.

In [ ]: